Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce memory usage in CG #130

Merged
merged 3 commits into from Jun 26, 2017

Conversation

haampie
Copy link
Member

@haampie haampie commented Jun 15, 2017

  • Pre-allocates the vectors, improvement shown below.
  • Removes the dependency of CG on KrylovSubspace (makes little sense for a 3-term recurrence method IMHO). Also: the KrylovSubspace methods are the main reason for inefficient memory usage. For the moment it was easier to fix just CG.
  • Fixes a bug where the CG method starts while the initial guess is the actual solution (I still have to add a test for this)
  • Adds a simple benchmark for the CG method
  • Relies always on A_mul_B! for matrix-vector products. Since we're switching to LinearMaps this shouldn't be a problem.
  • Also: removes the check whether all elements of b are zero. I would say CG should not be responsible for this.

Benchmark results:

LinearSystemsBench.cg()

This PR

BenchmarkTools.Trial:
  memory estimate:  45.80 MiB
  allocs estimate:  437
  --------------
  minimum time:     2.561 s (0.16% GC)
  median time:      2.580 s (0.27% GC)
  mean time:        2.580 s (0.27% GC)
  maximum time:     2.598 s (0.38% GC)
  --------------
  samples:          2
  evals/sample:     1

Current implementation

  BenchmarkTools.Trial:
  memory estimate:  1.53 GiB
  allocs estimate:  1041
  --------------
  minimum time:     3.173 s (9.28% GC)
  median time:      3.246 s (11.98% GC)
  mean time:        3.246 s (11.98% GC)
  maximum time:     3.320 s (14.57% GC)
  --------------
  samples:          2
  evals/sample:     1

@codecov-io
Copy link

codecov-io commented Jun 15, 2017

Codecov Report

Merging #130 into master will decrease coverage by 0.24%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #130      +/-   ##
==========================================
- Coverage   86.54%   86.29%   -0.25%     
==========================================
  Files          18       18              
  Lines        1390     1394       +4     
==========================================
  Hits         1203     1203              
- Misses        187      191       +4
Impacted Files Coverage Δ
src/cg.jl 91.3% <100%> (-6.32%) ⬇️
src/common.jl 78.94% <0%> (-5.27%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ac7bbee...ca67d46. Read the comment docs.

@haampie
Copy link
Member Author

haampie commented Jun 15, 2017

Hmm, I missed the follow up of @shivin9 in #128, but still I think it is better to not use the KrylovSubspace functions and shoot just for A_mul_B!. CG does Krylov subspace things implicitly, and it feels hacky to initialize a KrylovSubspace of order 1.

I'll do the same bench against #128 soon.

@timholy
Copy link
Member

timholy commented Jun 15, 2017

Also: removes the check whether all elements of b are zero. I would say CG should not be responsible for this.

Since there's a test for this, it's fine to change as long as it passes the test. Without that check it used to return all-NaN, but presumably your fix of the starting-point-at-solution bug fixed this.

@haampie
Copy link
Member Author

haampie commented Jun 15, 2017

I think the NaN result is a bug in the current implementation because the initial residual is not checked. With b = zeros(n) and x0 = zeros(n) you start out with a solution, so the iterations should not even start. The current implementation only inspects the residual after the first iteration; this PR does it before the first iteration.

If b = zeros(n) while x0 = rand(n), then CG works just fine.

@lopezm94
Copy link
Contributor

Removing KrylovSubspace should probably be in a new issue and left intact in this PR. Some solvers in this package already accept KrylovSubspace as input, the intention with making the cg implementation dependant was to enable it as input in the future. So if we are going to remove KrylovSubspace, it should be generalized in the whole package.

If I remember correctly there was an idea to remove KrylovSubspace from the package, right @jiahao? Something along the lines of using it only as a store of vectors and providing normalization functions. I could open a new issue about this. This could also improve readability for people who are more accustomed to see the matrix operations inside the solvers.

@lopezm94
Copy link
Contributor

@haampie can you leave KrylovSubspace dependency as it was and rebase? To not delay this PR's merge unnecessarily and then we can discuss KrylovSubspace in general for the package.

@haampie
Copy link
Member Author

haampie commented Jun 23, 2017

This PR doesn't remove the KrylovSubspace type, it removes its usage in cg.jl, so this PR can be merged I guess.

@lopezm94
Copy link
Contributor

lopezm94 commented Jun 23, 2017

Yeah the issue is about removing it's usage in cg, not about removing the type from the package. As I explained before:

Some solvers in this package already accept KrylovSubspace as input, the intention with making the cg implementation dependant was to enable it as input in the future. So if we are going to remove KrylovSubspace, it should be generalized in the whole package.

So the intention was to make cg accept krylovsubspace as input in the future. If we remove it then it can't be done and it should be in a separate issue.

@haampie
Copy link
Member Author

haampie commented Jun 24, 2017

Alright, fixed that

@barche
Copy link

barche commented Jun 25, 2017

Just for reference, I did a comparison with Trilinos.jl, resulting in the following for CG:

  • Your code: 2.7 s
  • Trilinos: 1.633 s

I attempted to reproduce your benchmark as accurately as possible here:
https://github.com/barche/Trilinos.jl/blob/master/benchmarks/belos.jl

@ChrisRackauckas
Copy link
Contributor

Thanks @barche for taking the time to do that. It'll serve as a good performance goal to keep in mind! That shows we're already pretty close to Trillinos which is awesome!

@KristofferC
Copy link

Is Trilinos multithreading the sparse matrix vector multiplication? If it is, one could try using something like https://github.com/JuliaSparse/MKLSparse.jl or https://github.com/dcjones/RecursiveSparseBlocks.jl to see if it is the matvec that makes the difference.

@haampie
Copy link
Member Author

haampie commented Jun 25, 2017

@barche did you run Julia with multithreading enabled? In my experience the Julia version runs somewhat faster if you do BLAS.set_num_threads(1) because of all the BLAS level 1 operations. On the other hand @KristofferC might have a good point as well.

@barche
Copy link

barche commented Jun 26, 2017

@KristofferC No, Trilinos was not multithreading. I'll do some more tests using that when I get back to one of my desktop machines.

@haampie I kept the Julia settings at their default, but switching off threading made no noticeable difference on this 2012 Macbook Air.

@lopezm94 lopezm94 merged commit eb031c0 into JuliaLinearAlgebra:master Jun 26, 2017
haampie added a commit to haampie/IterativeSolvers.jl that referenced this pull request Jun 28, 2017
* Reduce memory usage in CG

* Bring back Krylov subspace again

* Fix deprecation warning
haampie added a commit to haampie/IterativeSolvers.jl that referenced this pull request Jul 10, 2017
* Reduce memory usage in CG

* Bring back Krylov subspace again

* Fix deprecation warning
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants